Epileptic seizure prediction is an important research topic in clinical epilepsy treatment, as it gives patients and medical staff the opportunity to take precautionary measures. EEG, which records the electrical discharge of the brain, is a commonly used tool for studying brain activity. Many studies based on machine learning algorithms have been proposed to solve this task using EEG signals. In this study, we propose novel seizure prediction models based on convolutional neural networks and scalp EEG for binary classification between preictal and interictal states. The short-time Fourier transform (STFT) is used to convert raw EEG signals into STFT spectra, which serve as the input to the models. Fusion features are obtained through side-output constructions and used to train and test our models. The test results show that our models achieve comparable results in both sensitivity and FPR with the fusion features. The proposed patient-specific model can be used in seizure prediction systems for EEG classification.
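The conversion of a raw signal into a short-time Fourier transform representation can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the Hann window, the window size, the hop length, and the synthetic 10 Hz test signal are all assumptions chosen for the example, and a direct DFT is used so that the block needs only the standard library.

```python
import cmath
import math


def stft_magnitude(signal, window_size, hop):
    """Return one magnitude spectrum per windowed frame of the signal."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(window_size)]
        # Direct DFT, keeping only the non-negative frequency bins.
        spectrum = []
        for k in range(window_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical input: a 10 Hz sine sampled at 64 Hz for 4 seconds.
fs = 64
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(4 * fs)]
spec = stft_magnitude(signal, window_size=64, hop=32)

# With a 64-sample window at 64 Hz, each bin spans 1 Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time frame and each column one frequency bin, so the result can be treated as a 2D image and passed to a convolutional network.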
He LI Yutaro IWAMOTO Xianhua HAN Lanfen LIN Akira FURUKAWA Shuzo KANASAKI Yen-Wei CHEN
Convolutional neural networks (CNNs) have become popular in medical image segmentation. The widely used deep CNNs are customized to extract multiple representative features for two-dimensional (2D) data, generally called 2D networks. However, 2D networks are inefficient in extracting three-dimensional (3D) spatial features from volumetric images. Although most 2D segmentation networks can be extended to 3D networks, the naively extended 3D methods are resource-intensive. In this paper, we propose an efficient and accurate network for fully automatic 3D segmentation. Specifically, we designed a 3D multiple-contextual extractor to capture rich global contextual dependencies from different feature levels. Then we leveraged an ROI-estimation strategy to crop the ROI bounding box. Meanwhile, we used a 3D ROI-attention module to improve the accuracy of in-region segmentation in the decoder path. Moreover, we used a hybrid Dice loss function to address the issues of class imbalance and blurry contour in medical images. By incorporating the above strategies, we realized a practical end-to-end 3D medical image segmentation with high efficiency and accuracy. To validate the 3D segmentation performance of our proposed method, we conducted extensive experiments on two datasets and demonstrated favorable results over the state-of-the-art methods.
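To illustrate why a Dice-based loss helps with class imbalance, the sketch below computes a plain soft Dice loss on flattened voxel lists. The abstract does not specify how the hybrid Dice loss is composed, so this shows only the standard soft Dice term; the function name, the smoothing constant `eps`, and the toy voxel data are assumptions.

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for flattened probabilities and a binary mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    # eps keeps the ratio defined when both inputs are empty.
    dice = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice


# Hypothetical voxels: the foreground is a small minority.
target = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
perfect = [0.0] * 8 + [1.0, 1.0]
all_background = [0.0] * 10

loss_perfect = soft_dice_loss(perfect, target)
loss_background = soft_dice_loss(all_background, target)
```

Predicting background everywhere labels 8 of the 10 voxels correctly, yet its Dice loss is close to 1, because the loss depends on the overlap with the foreground rather than on per-voxel accuracy.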
Longjiao ZHAO Yu WANG Jien KATO Yoshiharu ISHIKAWA
Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.
Yuto OMAE Yuki SAITO Yohei KAKIMOTO Daisuke FUKAMACHI Koichi NAGASHIMA Yasuo OKUMURA Jun TOYOTANI
In this article, a GUI system is proposed to support clinical cardiology examinations. The proposed system estimates “pulmonary artery wedge pressure” based on patients' chest radiographs using an explainable regression-based convolutional neural network. The GUI system was validated by performing an effectiveness survey with 23 cardiology physicians with medical licenses. The results indicated that many physicians considered the GUI system to be effective.
Ze Fu GAO Hai Cheng TAO Qin Yu ZHU Yi Wen JIAO Dong LI Fei Long MAO Chao LI Yi Tong SI Yu Xin WANG
Aiming at the problem of non-line-of-sight (NLOS) signal recognition for ultra-wideband (UWB) positioning, we utilize the concepts of neural network clustering and neural network pattern recognition. We propose a classification algorithm based on self-organizing feature mapping (SOM) neural network batch processing, and a recognition algorithm based on a convolutional neural network (CNN). By assigning different weights to the learning, training, and testing parts of a data set of UWB location signals with known patterns, a strong NLOS signal recognizer is trained to minimize the recognition error rate. Finally, the proposed NLOS signal recognition algorithm is verified using data sets from real scenarios. The test results show that the proposed algorithm can solve the problem of UWB NLOS signal recognition under strong signal interference. The simulation results illustrate that the proposed algorithm is significantly more effective than other algorithms.
Daiki TODA Ren ANZAI Koichi ICHIGE Ryo SAITO Daichi UEKI
A method of radar-based contactless vital-sign sensing and electrocardiogram (ECG) signal reconstruction using deep learning is proposed. A radar system is an effective tool for contactless vital-sign sensing because it can measure a small displacement of the body surface without contact. However, most conventional methods have limited evaluation indices and measurement conditions. The proposed method measures body-surface-displacement signals by using frequency-modulated continuous-wave (FMCW) radar and reconstructs ECG signals using a convolutional neural network (CNN). This study conducted two experiments. First, we trained a model using data obtained from six subjects breathing in a seated condition. Second, we added sine wave noise to the data and trained the model again. The proposed model is evaluated with the correlation coefficient between the reconstructed and actual ECG signals. The results of the first experiment show that the subjects' ECG signals are successfully reconstructed by the proposed method. Those of the second experiment show that the proposed method can reconstruct signal waveforms even in an environment with a low signal-to-noise ratio (SNR).
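The evaluation by correlation coefficient can be sketched as below. The abstract does not say which coefficient is used, so the Pearson coefficient is an assumption, as are the function name and the synthetic waveforms standing in for actual and reconstructed ECG signals.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Hypothetical stand-ins for an actual and a reconstructed waveform.
actual = [math.sin(2 * math.pi * t / 50) for t in range(200)]
# Same shape as the actual waveform, with a different gain and offset.
reconstructed = [0.8 * v + 0.1 for v in actual]
inverted = [-v for v in actual]

r_same_shape = pearson_correlation(actual, reconstructed)
r_inverted = pearson_correlation(actual, inverted)
```

Because the coefficient is invariant to gain and offset, it rewards a reconstruction that matches the waveform shape even when its amplitude differs from the reference.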
Hyunghoon KIM Jiwoo SHIN Hyo Jin JO
Among the various studies of attacks on autonomous vehicles (AVs), a phantom attack has been proposed in which an advanced driver assistance system (ADAS) misclassifies a fake object created by an adversary as a real object. In this paper, we propose F-GhostBusters, an improved version of GhostBusters, which detects phantom attacks. The proposed model uses a new feature, i.e., the frequency of images. Experimental results show that F-GhostBusters not only improves the detection performance of GhostBusters but also complements its accuracy against adversarial examples.
Yuanwei HOU Yu GU Weiping LI Zhi LIU
Fast-evolving credential attacks have been a great security challenge to current password-based information systems. Recently, biometric factors such as face, iris, or fingerprint, which are difficult to forge, have risen as key elements for designing passwordless authentication. However, capturing and analyzing such factors usually requires special devices, hindering their feasibility and practicality. To this end, we present WiASK, a device-free WiFi sensing enabled Authentication System exploring Keystroke dynamics. More specifically, WiASK captures the keystrokes of a user typing a pre-defined, easy-to-remember string by leveraging the existing WiFi infrastructure. But instead of focusing on the string itself, which is vulnerable to password attacks, WiASK interprets the way it is typed, i.e., the keystroke dynamics, into user identity, based on the biologically validated correlation between them. We prototype WiASK on low-cost off-the-shelf WiFi devices and verify its performance in three real environments. Empirical results show that WiASK achieves on average 93.7% authentication accuracy, a 2.5% false accept rate, and a 5.1% false reject rate.
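The false accept rate and false reject rate reported for an authentication system are computed from two separate pools of attempts, as the sketch below shows. The function name and the attempt counts are hypothetical and are not taken from the WiASK experiments.

```python
def authentication_rates(genuine_accepted, impostor_accepted):
    """Return (false_accept_rate, false_reject_rate).

    genuine_accepted:  one bool per attempt by the legitimate user.
    impostor_accepted: one bool per attempt by someone else.
    """
    # A false reject is a legitimate attempt that was denied.
    frr = genuine_accepted.count(False) / len(genuine_accepted)
    # A false accept is an impostor attempt that was let through.
    far = impostor_accepted.count(True) / len(impostor_accepted)
    return far, frr


# Hypothetical outcomes: 20 genuine attempts with 1 rejection,
# and 40 impostor attempts with 1 acceptance.
genuine = [True] * 19 + [False]
impostor = [False] * 39 + [True]

far, frr = authentication_rates(genuine, impostor)
```

The two rates use different denominators, so they need not sum to the overall error rate; the overall accuracy also depends on the ratio of genuine to impostor attempts.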
Chenchen MENG Jun WANG Chengzhi DENG Yuanyun WANG Shengqian WANG
Feature representation is a key component of most visual tracking algorithms. It is difficult to deal with complex appearance changes using low-level hand-crafted features because of their weak representation capacity. In this paper, we propose a novel tracking algorithm that combines joint dictionary pair learning with a convolutional neural network (CNN). We utilize a CNN model trained on ImageNet-Vid to extract target features. The CNN includes three convolutional layers and two fully connected layers. Dictionary pair learning follows the second fully connected layer. The joint dictionary pair is learned from the deep features extracted by the trained CNN model. The temporal variations of target appearances are learned in the dictionary learning. We use the learned dictionaries to encode target candidates. A linear combination of atoms in the learned dictionary is used to represent target candidates. Extensive experimental evaluations on OTB2015 demonstrate superior performance against state-of-the-art (SOTA) trackers.
Thi Thu Thao KHONG Takashi NAKADA Yasuhiko NAKASHIMA
We introduce a hybrid Bayesian-convolutional neural network (hyBCNN) for improving the robustness against adversarial attacks and decreasing the computation time in the Bayesian inference phase. Our hyBCNN models combine part of a Bayesian neural network (BNN) with a CNN. Starting from pre-trained CNNs, we replace only the convolutional layers and activation functions of the initial stage of the CNNs with our Bayesian convolutional (BC) and Bayesian activation (BA) layers as a form of transfer learning. We keep the remainder of the CNNs unchanged. We adopt the Bayes without Bayesian Learning (BwoBL) algorithm for hyBCNN networks to execute Bayesian inference towards adversarial robustness. Our proposal outperforms adversarial training and robust activation functions, which are currently the leading defense methods for CNNs against adversarial attacks such as PGD and C&W. Moreover, the proposed architecture with BwoBL can easily be integrated into any pre-trained CNN, especially scaling networks such as ResNet and EfficientNet, with better performance on large-scale datasets. In particular, under an l∞ norm PGD attack of pixel perturbation ε=4/255 with 100 iterations on ImageNet, our best hyBCNN EfficientNet reaches 93.92% top-5 accuracy without additional training.
Xin ZENG Lin ZHANG Zhongqiang LUO Xingzhong XIONG Chengjie LI
In recent years, visual tracking has steadily improved, but some methods still suffer from low tracking accuracy and success rates. Although some trackers are more accurate, they take more time. To solve this problem, we propose a reinforced tracker based on Hierarchical Convolutional Features (HCF). HOG, color-naming, and grayscale features are used with different weights to supplement the convolutional features, which enhances tracking robustness. At the same time, we improve the model update strategy to reduce the time cost. This tracker is called RHCF, and the code is published at https://github.com/z15846/RHCF. Experiments on the OTB2013 dataset show that our tracker effectively improves accuracy and success rate.
Zhi WENG Longzhen FAN Yong ZHANG Zhiqiang ZHENG Caili GONG Zhongyue WEI
As the basis of fine breeding management and animal husbandry insurance, individual recognition of dairy cattle is an important issue in the field of animal husbandry management. Traditional methods of cow identification have limitations, such as identifiers that are easily dropped or falsified, and can no longer meet the needs of modern intelligent pasture management. In recent years, with the rise of computer vision technology, deep learning has developed rapidly in the field of face recognition. Its recognition accuracy has surpassed the level of human face recognition, and it has been widely used in production environments. However, research on the facial recognition of large livestock, such as dairy cattle, still needs to be developed and improved. In this letter, following the idea of a residual network, we propose an improved convolutional neural network (Res_5_2Net) method for individual dairy cow recognition based on dairy cow facial images. The recognition accuracy on our self-built cow face database (3012 training sets, 1536 test sets) reaches 94.53%. The experimental results show that the efficiency of dairy cow identification is effectively improved.
Convolutional neural networks (CNNs) have made extraordinary progress in image classification tasks. However, using a CNN directly to detect image manipulation is less effective. To address this problem, we propose an image filtering layer and a multi-scale feature fusion module, which guide the model to perform image manipulation detection more accurately and effectively. A series of experiments shows that our model improves on previous research in image manipulation detection.
Hiro TAMURA Kiyoshi YANAGISAWA Atsushi SHIRANE Kenichi OKADA
This paper presents a physical-layer wireless device identification method that uses a convolutional neural network (CNN) operating on a quadrant IQ transition image. This work introduces classification and detection tasks in one process. The proposed method can identify IoT wireless devices by exploiting their RF fingerprints, a technology that identifies wireless devices by using unique variations in their analog signals. We propose a quadrant IQ image technique to reduce the size of the CNN while maintaining accuracy. The CNN utilizes the IQ transition image, which is cut into four parts by image processing. An over-the-air experiment is performed on six Zigbee wireless devices to confirm the validity of the proposed identification method. The measurement results demonstrate that the proposed method can achieve 99% accuracy with a lightweight CNN model with 36,500 weight parameters in serial use and 146,000 in parallel use. Furthermore, the proposed threshold algorithm can verify authenticity using one classifier and achieved 80% accuracy for further secured wireless communication. This work also introduces the identification of expanded signals with SNR between 10 and 30 dB. As a result, at SNR values above 20 dB, the proposals achieve classification and detection accuracies of 87% and 80%, respectively.
Hiroya YAMAMOTO Daichi KITAHARA Hiroki KURODA Akira HIRABAYASHI
This paper addresses single image super-resolution (SR) based on convolutional neural networks (CNNs). It is known that CNNs trained with least square errors or least absolute errors recover high-frequency components insufficiently in their output SR images. To generate realistic high-frequency components, SR methods using generative adversarial networks (GANs), composed of one generator and one discriminator, have been developed. However, when the generator tries to induce the discriminator's misjudgment, not only realistic high-frequency components but also some artifacts are generated, and objective indices such as PSNR decrease. To reduce the artifacts in GAN-based SR methods, we consider the set of all SR images whose square errors between downscaling results and the input image are within a certain range, and propose to apply the metric projection onto this consistent set in the output layers of the generators. The proposed technique guarantees consistency between output SR images and input images, and generators with the proposed projection can generate high-frequency components with few artifacts while keeping low-frequency ones as appropriate for the known noise level. Numerical experiments show that the proposed technique reduces the artifacts included in the original SR images of a GAN-based SR method while generating realistic high-frequency components with better PSNR values in both noise-free and noisy situations. Since the proposed technique can be integrated into various generators if the downscaling process is known, we can give existing methods consistency with the input images without degrading other SR performance.
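The idea of projecting a generator output onto the set of images consistent with the input can be sketched in one dimension. The example assumes that downscaling is plain block averaging, for which the projection has a simple closed form; the actual downscaling operator and projection used in the paper may differ, and all function names and values here are hypothetical.

```python
import math


def downscale(x, s):
    """Average every block of s consecutive samples (len(x) % s == 0)."""
    return [sum(x[i:i + s]) / s for i in range(0, len(x), s)]


def project_onto_consistent_set(x, y, s, radius):
    """Closest z to x (Euclidean) with ||downscale(z) - y||_2 <= radius."""
    residual = [d - t for d, t in zip(downscale(x, s), y)]
    norm = math.sqrt(sum(r * r for r in residual))
    if norm <= radius:
        return list(x)  # x is already consistent with y
    shrink = 1.0 - radius / norm
    # Every sample in block i moves by the same amount, so the detail
    # inside each block (the high-frequency content) is left unchanged.
    return [x[i] - shrink * residual[i // s] for i in range(len(x))]


# Hypothetical generator output x and low-resolution input y.
x = [1.0, 3.0, 5.0, 7.0]
y = [1.0, 5.0]

z_exact = project_onto_consistent_set(x, y, s=2, radius=0.0)
z_loose = project_onto_consistent_set(x, y, s=2, radius=0.5)
err_loose = math.sqrt(sum((d - t) ** 2
                          for d, t in zip(downscale(z_loose, 2), y)))
```

With `radius=0.0` the projected signal downscales exactly to `y`; a positive radius leaves room for noise in the input, and the projection then stops at the boundary of the allowed error.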
Yasuhiro NAKAHARA Masato KIYAMA Motoki AMAGASAKI Qian ZHAO Masahiro IIDA
Low power consumption is important in edge artificial intelligence (AI) chips, where the power supply is limited. Therefore, we propose the reconfigurable neural network accelerator (ReNA), an AI chip that can process both a convolutional layer and a fully connected layer with the same structure by reconfiguring the circuit. In addition, we developed tools for pre-evaluating the performance when a deep neural network (DNN) model is implemented on ReNA. With this approach, we established the flow for implementing DNN models on ReNA and evaluated its power consumption. ReNA achieved 1.51 TOPS/W in the convolutional layer and 1.38 TOPS/W overall in a VGG16 model with a 70% pruning rate.
Object contour detection is the task of extracting the shape formed by the boundaries between objects in an image. Conventional methods limit the detection targets to specific categories or misdetect the edges of patterns inside an object. We propose a new method that represents a contour image in which each pixel value is the distance to the boundary. Contour detection thus becomes a regression problem that estimates this contour image. A deep convolutional network for contour estimation is combined with stereo vision to detect unspecified object contours. Furthermore, because the inference targets are similar and the network structure is common, we propose a network that simultaneously estimates both contour and disparity with fully shared weights. In experiments, the multi-task network produced a good precision-recall curve, and the F-measure was about 0.833 on the FlyingThings3D dataset. The L1 loss of disparity estimation on the dataset was 2.571. This network halves the amount of calculation and the memory capacity, and the accuracy drop compared with the dedicated networks is slight. We then quantize both the weights and activations of the network to 3 bits. We devise a dedicated hardware architecture for the quantized CNN and implement it on an FPGA. This circuit uses only internal memory to perform forward propagation calculations, which eliminates high-power external memory accesses. The circuit is a stall-free pixel-by-pixel pipeline and performs convolution calculations for 8 rows, 16 input channels, 16 output channels, and 3 by 3 pixels in parallel. The convolution performance at an operating frequency of 250 MHz is 9 TOPS.
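The stated throughput of roughly 9 tera-operations per second follows from the stated parallelism and clock frequency, as the short calculation below shows. It assumes that every parallel multiply-accumulate completes once per clock cycle and counts as two operations (one multiply and one add); that counting convention is an assumption and is not stated in the abstract.

```python
# Parallelism figures taken from the abstract.
rows = 8
input_channels = 16
output_channels = 16
kernel_pixels = 3 * 3
clock_hz = 250e6  # 250 MHz operating frequency

# Multiply-accumulates issued in parallel per clock cycle.
macs_per_cycle = rows * input_channels * output_channels * kernel_pixels

# Assumption: one multiply-accumulate counts as two operations.
ops_per_cycle = 2 * macs_per_cycle

tera_ops_per_second = ops_per_cycle * clock_hz / 1e12
print(macs_per_cycle, tera_ops_per_second)
```

Under these assumptions the circuit issues 18,432 multiply-accumulates per cycle, which gives about 9.2 tera-operations per second at 250 MHz and agrees with the rounded figure in the abstract.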
Masashi NISHIYAMA Michiko INOUE Yoshio IWAI
We propose an attention mechanism in deep learning networks for gender recognition using the gaze distribution of human observers when they judge the gender of people in pedestrian images. Prevalent attention mechanisms spatially compute the correlation among values of all cells in an input feature map to calculate attention weights. If a large bias in the background of pedestrian images (e.g., test samples and training samples containing different backgrounds) is present, the attention weights learned using the prevalent attention mechanisms are affected by the bias, which in turn reduces the accuracy of gender recognition. To avoid this problem, we incorporate an attention mechanism called gaze-guided self-attention (GSA) that is inspired by human visual attention. Our method assigns spatially suitable attention weights to each input feature map using the gaze distribution of human observers. In particular, GSA yields promising results even when using training samples with the background bias. The results of experiments on publicly available datasets confirm that our GSA, using the gaze distribution, is more accurate in gender recognition than currently available attention-based methods in the case of background bias between training and test samples.
Masayuki ODAGAWA Tetsushi KOIDE Toru TAMAKI Shigeto YOSHIDA Hiroshi MIENO Shinji TANAKA
This paper presents the results of examining the feasibility of automatic unclear-region detection in a CAD system for colorectal tumors using real-time endoscopic video images. We confirmed that it is possible to realize a CAD system with a clear-region navigation function, consisting of unclear-region detection by YOLO2 and classification by AlexNet and SVMs, on customizable embedded DSP cores. Moreover, we confirmed that the real-time CAD system can be built as a low-power ASIC using customizable embedded DSP cores.
Isana FUNAHASHI Taichi YOSHIDA Xi ZHANG Masahiro IWAHASHI
In this paper, we propose an image adjustment method for multi-exposure images based on convolutional neural networks (CNNs). We refer to image regions that lack information because of saturation or object motion in multi-exposure images as lacking areas. Lacking areas cause ghosting artifacts in images fused from sets of multi-exposure images by conventional fusion methods that tackle the artifact. To avoid this problem, the proposed method estimates the information of lacking areas via adaptive inpainting. The proposed CNN consists of three networks: a warp and refinement network, a detection network, and an inpainting network. The second and third networks detect lacking areas and estimate their pixel values, respectively. In the experiments, a simple fusion method combined with the proposed method outperforms state-of-the-art fusion methods in peak signal-to-noise ratio. Moreover, when the proposed method is applied as pre-processing for various fusion methods, the results show a clear reduction in artifacts.